Introduction to R

Laboratory of Statistics and Mathematics 2025/2026

Giuseppe Alfonzetti

Our goal

The data analysis pipeline:

Import

You will learn how to read external data sources from R. In particular, we will focus on:

  • Universal text formats, such as .csv;
  • Proprietary spreadsheet formats, such as Excel .xls and .xlsx.
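
As a minimal sketch of what importing these formats can look like, the example below uses the readr and readxl packages. Both packages are an assumption here (the slides do not name specific tools), and the small `scores` table is made up purely for illustration. The Excel file is one of the example workbooks that ship with readxl.

```r
# Assumption: the readr and readxl packages are installed.
library(readr)
library(readxl)

# Write a small illustrative .csv to a temporary file, then read it back.
csv_path <- tempfile(fileext = ".csv")
write_csv(data.frame(id = 1:3, score = c(7.5, 8, 6.25)), csv_path)
scores <- read_csv(csv_path, show_col_types = FALSE)
scores

# readxl ships example workbooks; readxl_example() returns the path to one.
xlsx_path <- readxl_example("datasets.xlsx")
excel_sheets(xlsx_path)                          # list the sheets in the workbook
iris_xl <- read_excel(xlsx_path, sheet = "iris") # read a single sheet by name
iris_xl
```

With your own data, you would replace the temporary and example paths with the path to your file.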

Tidy

You will learn how to store your data in a consistent format:

  • Each row is an observation;
  • Each column is a variable, with a unique name;
  • Each variable has a specific type (numeric, character, logical, etc.).
# A tibble: 234 × 7
   manufacturer displ model      class     cyl trans        hwy
   <chr>        <dbl> <chr>      <chr>   <int> <chr>      <int>
 1 audi           1.8 a4         compact     4 auto(l5)      29
 2 audi           1.8 a4         compact     4 manual(m5)    29
 3 audi           2   a4         compact     4 manual(m6)    31
 4 audi           2   a4         compact     4 auto(av)      30
 5 audi           2.8 a4         compact     6 auto(l5)      26
 6 audi           2.8 a4         compact     6 manual(m5)    26
 7 audi           3.1 a4         compact     6 auto(av)      27
 8 audi           1.8 a4 quattro compact     4 manual(m5)    26
 9 audi           1.8 a4 quattro compact     4 auto(l5)      25
10 audi           2   a4 quattro compact     4 manual(m6)    28
# ℹ 224 more rows
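
The table above appears to be a subset of the `mpg` dataset that ships with ggplot2. Assuming so, the following sketch rebuilds it and checks the three tidy-data properties listed above. The object name `cars` is chosen here for illustration only.

```r
# Assumption: the table shown is a column subset of ggplot2's mpg dataset.
library(ggplot2)  # provides the mpg dataset
library(dplyr)

cars <- mpg |>
  select(manufacturer, displ, model, class, cyl, trans, hwy)

# One row per observation, one column per variable.
dim(cars)

# Column names are unique (anyDuplicated() returns 0 when there are no repeats).
anyDuplicated(names(cars)) == 0

# Each variable has a specific type.
sapply(cars, class)
```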

Transform

You will learn how to transform the data to suit your needs:

  • Select the variables of interest;
  • Combine existing variables in new variables;
  • Filter relevant observations;
  • Reshape data.
# A tibble: 234 × 3
   displ class     hwy
   <dbl> <chr>   <int>
 1   1.8 compact    29
 2   1.8 compact    29
 3   2   compact    31
 4   2   compact    30
 5   2.8 compact    26
 6   2.8 compact    26
 7   3.1 compact    27
 8   1.8 compact    26
 9   1.8 compact    25
10   2   compact    28
# ℹ 224 more rows
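
The four operations above can be sketched with dplyr and tidyr, again assuming the data are ggplot2's `mpg` dataset. The derived variable `hwy_per_litre` and the object names are hypothetical, chosen only to illustrate each step.

```r
# Assumption: dplyr, tidyr and ggplot2 (for the mpg data) are installed.
library(ggplot2)
library(dplyr)
library(tidyr)

# Select the variables of interest.
small <- mpg |> select(displ, class, hwy)

# Combine existing variables into a new one (illustrative ratio).
with_ratio <- small |> mutate(hwy_per_litre = hwy / displ)

# Filter relevant observations.
compact_only <- with_ratio |> filter(class == "compact")

# Reshape: stack city and highway mileage into a single column.
long <- mpg |>
  select(model, cty, hwy) |>
  pivot_longer(c(cty, hwy), names_to = "road", values_to = "mileage")
```

Each step returns a new table and leaves `mpg` untouched, so intermediate results can be inspected one at a time.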

Visualize

You will learn how to explore data patterns with visualisations.
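
As one possible illustration, the sketch below draws a scatter plot with ggplot2, assuming the `mpg` dataset used in the earlier tables. The choice of variables and labels is illustrative.

```r
# Assumption: ggplot2 is installed; mpg is its built-in dataset.
library(ggplot2)

# Highway mileage against engine displacement, coloured by vehicle class.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  labs(x = "Engine displacement (litres)",
       y = "Highway miles per gallon",
       colour = "Class")

print(p)
```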

Model

This is the only step where “math” enters the game. It ranges from simple descriptive statistics to more elaborate modelling strategies, and is often combined with visualisations.
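
A minimal sketch of both ends of that range, assuming ggplot2's `mpg` dataset: a few descriptive statistics, followed by a simple linear regression. The choice of `hwy` as response and `displ` as predictor is only an example.

```r
# Assumption: ggplot2 is installed; mpg is its built-in dataset.
library(ggplot2)

# Descriptive statistics for highway mileage.
hwy_mean <- mean(mpg$hwy)
hwy_sd   <- sd(mpg$hwy)
summary(mpg$hwy)

# A simple linear model: highway mileage as a function of engine displacement.
fit <- lm(hwy ~ displ, data = mpg)
coef(fit)

# Combine the model with a visualisation: data points plus the fitted line.
p_fit <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

print(p_fit)
```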

Communicate

  • Write reports;
  • Choose appropriate visualisations;
  • Highlight the results in terms of insights.

Why R?

Replicability!